Early prediction is clinically regarded as one of the essential parts of cerebral palsy (CP) treatment. We propose to implement a low-cost and interpretable classification system for CP prediction based on the General Movement Assessment (GMA). We design a PyTorch-based attention-informed graph convolutional network to identify early infants at risk of CP from skeletal data extracted from RGB videos. We also design a frequency module that learns CP movements in the frequency domain while filtering noise. Our system requires only consumer-grade RGB videos for training, and supports interactive-time CP prediction by providing interpretable CP classification results.
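The frequency module above is described only at a high level. A minimal illustration of the underlying idea, frequency-domain filtering of a skeletal joint trajectory, might look as follows; a fixed low-pass over rFFT bins stands in for the learned frequency weighting, and `lowpass_joint_trajectory` and `keep_bins` are illustrative names, not the authors' API:

```python
import numpy as np

def lowpass_joint_trajectory(traj, keep_bins=8):
    """Filter one joint-coordinate time series in the frequency domain.

    traj: shape (T,) -- one coordinate of one skeleton joint over T frames.
    keep_bins: number of low-frequency rFFT bins to retain; the remaining
    bins are treated as jitter/noise and zeroed out.
    """
    spec = np.fft.rfft(traj)
    spec[keep_bins:] = 0.0
    return np.fft.irfft(spec, n=len(traj))

# Toy trajectory: smooth motion plus high-frequency sensor jitter.
t = np.linspace(0.0, 2.0 * np.pi, 64)
clean = np.sin(t)
noisy = clean + 0.3 * np.sin(25.0 * t)
filtered = lowpass_joint_trajectory(noisy)
```

In the paper's setting the filter response would be learned per channel rather than fixed, but the round trip through the frequency domain is the same.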
Musculoskeletal and neurological disorders are the most common causes of walking problems among older people, and they often lead to diminished quality of life. Analyzing walking motion data manually requires trained professionals, and the evaluations may not always be objective. To facilitate early diagnosis, recent deep-learning-based methods have shown promising results for automated analysis, as they can discover patterns that traditional machine learning methods have not found. We observe that existing work mostly applies deep learning to individual joint features such as the time series of joint positions. Because of the challenge of discovering inter-joint features, such as the distance between the feet (i.e. the stride width), from the generally smaller-scale medical datasets, these methods usually perform sub-optimally. As a result, we propose a solution that explicitly takes both individual joint features and inter-joint features as input, relieving the system from having to discover more complicated features from small data. Given the distinctive nature of the two types of features, we introduce a two-stream framework, with one stream learning from the time series of joint positions and the other from the time series of relative joint displacements. We further develop a mid-layer fusion module to combine the patterns discovered in the two streams for diagnosis, which results in a complementary representation of the data for better prediction performance. We validate our system on a benchmark dataset of 3D skeleton motion involving 45 patients with musculoskeletal and neurological disorders, and achieve a prediction accuracy of 95.56%, outperforming state-of-the-art methods.
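The two input types can be made concrete with a small sketch. The function names below are ours, and expressing displacements relative to a root joint is just one plausible reading of "relative joint displacement"; this is an illustration, not the paper's implementation:

```python
import numpy as np

def inter_joint_distance(skeleton, j_a, j_b):
    """Per-frame Euclidean distance between two joints.

    skeleton: (T, J, 3) array of 3D joint positions over T frames.
    With j_a/j_b set to the two ankles, this approximates the stride width.
    """
    return np.linalg.norm(skeleton[:, j_a] - skeleton[:, j_b], axis=-1)

def relative_joint_displacement(skeleton, root=0):
    """Inter-joint stream input: joint positions relative to a root joint."""
    return skeleton - skeleton[:, root:root + 1]
```

Feeding such precomputed geometric series to one stream spares the network from having to rediscover them from raw positions on a small dataset.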
Cleft lip is a congenital abnormality requiring expert surgical repair. The surgeon must have extensive experience and theoretical knowledge to perform the surgery, and Artificial Intelligence (AI) methods have been proposed to guide surgeons toward improved surgical outcomes. If AI could be used to predict the appearance of a repaired cleft lip, surgeons could use it as an adjunct to adjust their surgical technique and improve results. To explore the feasibility of this idea while protecting patient privacy, we propose a deep-learning-based image inpainting method capable of covering a cleft lip and generating a lip without a cleft. Our experiments are conducted on two real-world cleft lip datasets and are assessed by expert cleft lip surgeons to demonstrate the feasibility of the method.
Synthesizing multi-character interactions is a challenging task due to the complex and varied interactions between the characters. In particular, precise spatiotemporal alignment is required when generating close interactions such as dancing and fighting. Existing work on generating multi-character interactions focuses on producing a single type of reactive motion for a given sequence, which results in a lack of variety in the output motions. In this paper, we propose a novel way to create realistic human reactive motions not present in the given dataset by mixing and matching different types of close interactions. We propose a conditional hierarchical generative adversarial network with a multi-hot class embedding to generate the mixed-and-matched reactive motions of the follower from a given motion sequence of the leader. Experiments are conducted on both noisy (depth-based) and high-quality (MoCap-based) interaction datasets. The quantitative and qualitative results show that our approach outperforms the state-of-the-art methods on the given datasets. We also provide an augmented dataset with realistic reactive motions to stimulate future research in this field. The code is available at https://github.com/aman-goel1/imm
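One common way to realize a multi-hot class embedding, sketched here as an assumption rather than the paper's exact conditioning layer, is to sum the embedding vectors of all active interaction classes into a single condition vector:

```python
import numpy as np

def multi_hot_embedding(active_classes, table):
    """Condition vector for a set of interaction classes.

    active_classes: length-C 0/1 vector marking which classes are active.
    table: (C, D) learnable embedding matrix (here just a fixed array).
    Returns the (D,) sum of the active classes' embeddings.
    """
    mask = np.asarray(active_classes, dtype=bool)
    return table[mask].sum(axis=0)
```

Because the condition is a sum over active classes, novel combinations of interaction types (the "mix and match" setting) map to condition vectors the generator has never seen verbatim but can still interpolate over.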
Parkinson's disease (PD) is a progressive neurodegenerative disorder that causes a variety of motor dysfunction symptoms, including tremor, bradykinesia, rigidity and postural instability. The diagnosis of PD mainly relies on clinical experience rather than a definitive medical test, and the diagnostic accuracy is only about 73-84%, as it is challenged by the subjective opinions and varying experience of different medical experts. Therefore, an efficient and interpretable automatic PD diagnosis system is valuable for supporting clinicians in making more robust diagnostic decisions. To this end, we propose to classify Parkinsonian tremor, since it is one of the most predominant symptoms of PD and generalizes strongly. Unlike other computer-aided, time- and resource-consuming Parkinsonian tremor (PT) classification systems, our proposed SPAPNET requires only consumer-grade, non-intrusive video recordings of camera-facing human movements as input, providing undiagnosed patients with low-cost classification results as a PD warning sign. For the first time, we propose a novel attention module with a lightweight pyramidal channel-fusing architecture to extract the relevant PT information and filter noise efficiently. This design helps improve both classification performance and system interpretability. Experimental results show that our system outperforms the state of the art, achieving a balanced accuracy of 90.9% and an F1-score of 90.6% in classifying PT against the non-PT class.
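The pyramidal channel-fusion attention is not specified in the abstract. As a generic stand-in, a squeeze-and-excitation-style channel attention shows the basic mechanism of gating channels to emphasize relevant signals and suppress noise; all names, shapes and weights below are illustrative, not SPAPNET's architecture:

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation-style channel attention.

    features: (C, T) -- C feature channels over T time steps.
    w1: (H, C), w2: (C, H) -- the two layers of the gating MLP.
    Each channel is rescaled by a learned gate in (0, 1).
    """
    squeeze = features.mean(axis=1)               # global average pool, (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)        # ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid, (C,)
    return features * gates[:, None]

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 16))
out = channel_attention(feats,
                        rng.standard_normal((2, 4)),
                        rng.standard_normal((4, 2)))
```

A pyramidal variant would apply such gating at multiple temporal resolutions and fuse the results, but the per-channel gating step is the core idea.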
Motion control algorithms in the presence of pedestrians are critical for the development of safe and reliable autonomous vehicles (AVs). Traditional motion control algorithms rely on manually designed decision-making policies that neglect the mutual interactions between AVs and pedestrians. Recent advances in deep reinforcement learning, on the other hand, allow policies to be learned automatically without manual design. To tackle the problem of decision making in the presence of pedestrians, the authors introduce a framework based on Social Value Orientation and deep reinforcement learning (DRL) that is capable of generating decision-making policies with different driving styles. The policy is trained in a simulated environment using a state-of-the-art DRL algorithm. A novel, computationally efficient pedestrian model suitable for DRL training is also introduced. We perform experiments to validate our framework and conduct a comparative analysis of the policies obtained with two different model-free deep reinforcement learning algorithms. Simulation results show how the developed model exhibits natural driving behaviors, such as briefly stopping to facilitate the pedestrian's crossing.
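Social Value Orientation is commonly formalized as an angular weighting between the ego agent's reward and the other agent's reward. A minimal sketch under that standard assumption (the framework's exact reward shaping may differ):

```python
import math

def svo_reward(r_ego, r_pedestrian, svo_deg):
    """Blend the vehicle's own reward with the pedestrian's by an SVO angle.

    0 deg  = purely egoistic (only the vehicle's reward counts),
    45 deg = prosocial (equal weighting),
    90 deg = purely altruistic (only the pedestrian's reward counts).
    """
    theta = math.radians(svo_deg)
    return math.cos(theta) * r_ego + math.sin(theta) * r_pedestrian
```

Sweeping the SVO angle is what yields a family of policies with different driving styles from the same training setup.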
Detecting human-object interactions is essential for a comprehensive understanding of visual scenes. In particular, the spatial connections between humans and objects are important cues for reasoning about interactions. To this end, we propose a skeleton-aware graph convolutional network for human-object interaction detection, named SGCN4HOI. Our network exploits the spatial connections between human keypoints and object keypoints to capture their fine-grained structural interactions via graph convolutions. It fuses such geometric features with visual features and spatial configuration features obtained from human-object pairs. Furthermore, to better preserve object structural information and facilitate human-object interaction detection, we propose a novel skeleton-based object keypoint representation. The performance of SGCN4HOI is evaluated on the public benchmark V-COCO dataset. Experimental results show that the proposed approach outperforms the state-of-the-art pose-based models and achieves competitive performance against other models.
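The fine-grained spatial cues between human and object keypoints can be illustrated with a simple pairwise geometric feature: the offset vector plus its length for every human-object keypoint pair. This is a generic sketch of such features, not SGCN4HOI's actual feature extractor:

```python
import numpy as np

def pairwise_spatial_features(human_kps, object_kps):
    """Fine-grained spatial cues between two keypoint sets.

    human_kps: (H, 2), object_kps: (O, 2) in image coordinates.
    Returns (H, O, 3): [dx, dy, distance] for every keypoint pair,
    suitable as edge features in a graph over the joint skeleton.
    """
    diff = human_kps[:, None, :] - object_kps[None, :, :]   # (H, O, 2)
    dist = np.linalg.norm(diff, axis=-1, keepdims=True)     # (H, O, 1)
    return np.concatenate([diff, dist], axis=-1)            # (H, O, 3)
```

In a graph convolutional setting, features like these attach naturally to the edges connecting human and object keypoint nodes.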
The availability of frequent and cost-free satellite images is in growing demand in the research world. Satellite constellations such as Landsat 8 and Sentinel-2 provide a massive amount of valuable data daily. However, the discrepancy between these satellites' sensor characteristics makes it impractical to apply a segmentation model trained on one dataset directly to the other, which is why domain adaptation techniques have recently become an active research area in remote sensing. In this paper, an experiment in domain adaptation through style transfer is conducted using the HRSemI2I model to narrow the sensor discrepancy between Landsat 8 and Sentinel-2. This paper's main contribution is analyzing the expediency of that approach by comparing segmentation results on domain-adapted images with those on images without adaptation. The HRSemI2I model, adjusted to work with 6-band imagery, shows significant intersection-over-union performance improvement in both the mean and the per-class metrics. A second contribution is providing different schemes of generalization between the two label schemes, NALCMS 2015 and CORINE: the first scheme is standardization through higher-level land cover classes, and the second is through harmonization validation in the field.
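For intuition, the simplest possible form of cross-sensor "style transfer" is per-band statistics matching. The sketch below is a crude baseline of the kind such domain adaptation improves upon, not the HRSemI2I model itself; all names and data are illustrative:

```python
import numpy as np

def match_band_statistics(source, target):
    """Per-band style transfer baseline: shift each source band to the
    target sensor's mean and standard deviation.

    source, target: (bands, H, W) reflectance arrays from the two sensors.
    """
    out = np.empty_like(source, dtype=float)
    for b in range(source.shape[0]):
        s, t = source[b], target[b]
        out[b] = (s - s.mean()) / (s.std() + 1e-8) * t.std() + t.mean()
    return out

rng = np.random.default_rng(0)
landsat_like = rng.normal(0.2, 0.05, size=(6, 8, 8))   # toy 6-band imagery
sentinel_like = rng.normal(0.3, 0.10, size=(6, 8, 8))
adapted = match_band_statistics(landsat_like, sentinel_like)
```

Learned image-to-image translation goes far beyond matching first- and second-order statistics, but this is the gap it is trying to close.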
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Out-of-distribution detection is crucial to the safe deployment of machine learning systems. Currently, the state-of-the-art in unsupervised out-of-distribution detection is dominated by generative-based approaches that make use of estimates of the likelihood or other measurements from a generative model. Reconstruction-based methods offer an alternative approach, in which a measure of reconstruction error is used to determine if a sample is out-of-distribution. However, reconstruction-based approaches are less favoured, as they require careful tuning of the model's information bottleneck - such as the size of the latent dimension - to produce good results. In this work, we exploit the view of denoising diffusion probabilistic models (DDPM) as denoising autoencoders where the bottleneck is controlled externally, by means of the amount of noise applied. We propose to use DDPMs to reconstruct an input that has been noised to a range of noise levels, and use the resulting multi-dimensional reconstruction error to classify out-of-distribution inputs. Our approach outperforms not only reconstruction-based methods, but also state-of-the-art generative-based approaches.
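The multi-level reconstruction-error idea can be illustrated with a toy stand-in for the diffusion model: here a fixed projection denoiser replaces the learned reverse process, and `ood_score` with mean aggregation is our simplification, not the paper's exact scoring:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a model trained on in-distribution data: the training
# distribution lies on the 1-D subspace spanned by u, and the "denoiser"
# projects any noised point back onto it (a DDPM's learned reverse process
# plays this role in the real method).
u = np.array([1.0, 1.0]) / np.sqrt(2.0)

def denoise(x_noisy):
    return (x_noisy @ u) * u

def ood_score(x, noise_levels=(0.1, 0.5, 1.0), n_avg=32):
    """Noise x at several levels, reconstruct each noised copy, and
    aggregate the per-level reconstruction errors (here: their mean)."""
    errs = []
    for s in noise_levels:
        recon = np.stack([denoise(x + s * rng.standard_normal(2))
                          for _ in range(n_avg)])
        errs.append(np.mean(np.linalg.norm(recon - x, axis=1)))
    return float(np.mean(errs))

in_dist = np.array([2.0, 2.0])    # lies on the modelled subspace
out_dist = np.array([2.0, -2.0])  # same norm, but off the subspace
```

An in-distribution input is reconstructed faithfully at every noise level, while an out-of-distribution input is pulled back toward the modelled data at all levels, so its multi-level error is consistently larger.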